Implementation of Multidimensional Index Structures for Knowledge Discovery in Relational Databases
Abstract
Efficient query processing is a basic requirement of data mining algorithms. Clustering algorithms, association rule mining algorithms, and OLAP tools all rely on efficient query processors that can handle high-dimensional data. Within such a query processor, multidimensional index structures serve as a basic technique. Because implementing such an index structure is a difficult and time-consuming task, we propose a new approach: implementing the index structure on top of a commercial relational database system. In particular, we map the index structure to a relational database design and simulate its behavior using triggers and stored procedures. This can easily be done for a very large class of multidimensional index structures. To demonstrate feasibility and efficiency, we implemented an X-tree on top of Oracle 8. In several experiments on large databases, we recorded a performance improvement of up to a factor of 11.5 compared with a sequential scan of the database.
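The mapping described in the abstract can be illustrated with a small sketch. This is not the authors' X-tree or their Oracle 8 schema: it assumes a hypothetical one-level directory over 2-D points, uses SQLite through Python's `sqlite3` module in place of a commercial system, and lets a trigger stand in for the stored-procedure logic. The table names (`leaf`, `point`), the trigger name (`grow_mbr`), and the function names are all made up for illustration.

```python
import sqlite3


def build_index(points, leaf_capacity=4):
    """Load 2-D points into an in-memory database with a one-level directory."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- Hypothetical schema: one row per index page with its bounding rectangle.
        CREATE TABLE leaf (
            leaf_id INTEGER PRIMARY KEY,
            lo0 REAL, hi0 REAL, lo1 REAL, hi1 REAL
        );
        CREATE TABLE point (
            point_id INTEGER PRIMARY KEY,
            leaf_id INTEGER REFERENCES leaf(leaf_id),
            x0 REAL, x1 REAL
        );
        CREATE INDEX point_by_leaf ON point(leaf_id);
        -- The trigger enlarges the bounding rectangle of a leaf on every insert.
        CREATE TRIGGER grow_mbr AFTER INSERT ON point
        BEGIN
            UPDATE leaf SET
                lo0 = min(lo0, NEW.x0), hi0 = max(hi0, NEW.x0),
                lo1 = min(lo1, NEW.x1), hi1 = max(hi1, NEW.x1)
            WHERE leaf_id = NEW.leaf_id;
        END;
    """)
    # Naive bulk load (an assumption, not an X-tree split): sort, then chunk.
    ordered = sorted(points)
    for start in range(0, len(ordered), leaf_capacity):
        chunk = ordered[start:start + leaf_capacity]
        x0, x1 = chunk[0]
        cur = con.execute(
            "INSERT INTO leaf (lo0, hi0, lo1, hi1) VALUES (?, ?, ?, ?)",
            (x0, x0, x1, x1),
        )
        con.executemany(
            "INSERT INTO point (leaf_id, x0, x1) VALUES (?, ?, ?)",
            [(cur.lastrowid, a, b) for a, b in chunk],
        )
    return con


def range_query(con, lo, hi):
    """Return points in the box [lo, hi], pruning leaves that do not overlap it."""
    rows = con.execute(
        """
        SELECT p.x0, p.x1
        FROM leaf AS l JOIN point AS p ON p.leaf_id = l.leaf_id
        WHERE l.lo0 <= ? AND l.hi0 >= ? AND l.lo1 <= ? AND l.hi1 >= ?
          AND p.x0 BETWEEN ? AND ? AND p.x1 BETWEEN ? AND ?
        ORDER BY p.x0, p.x1
        """,
        (hi[0], lo[0], hi[1], lo[1], lo[0], hi[0], lo[1], hi[1]),
    )
    return rows.fetchall()


def sequential_scan(con, lo, hi):
    """Baseline: test every stored point against the query box."""
    rows = con.execute(
        "SELECT x0, x1 FROM point "
        "WHERE x0 BETWEEN ? AND ? AND x1 BETWEEN ? AND ? ORDER BY x0, x1",
        (lo[0], hi[0], lo[1], hi[1]),
    )
    return rows.fetchall()


if __name__ == "__main__":
    demo = [(float(i), float((i * 7) % 10)) for i in range(20)]
    db = build_index(demo)
    print(range_query(db, (2, 0), (6, 5)))
```

The sketch only shows that directory pages and data points can live in ordinary tables and that a range query becomes a join with an overlap predicate. A real X-tree adds a multi-level directory, supernodes, and overlap-minimizing splits, none of which are modeled here, and the toy says nothing about the performance figures reported in the abstract.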
Similar resources
Addressing Internal Consistency with Multidimensional Conditional Functional Dependencies
Conditional functional dependencies (CFDs) have recently been introduced as a novel approach for capturing the external consistency of relational data by comparing tuples. They define bindings of semantically related values that originate from flat domains. In real-world applications, domains with multidimensional metadata often have to be addressed and data are not only stored in relational data...
Attribute-oriented Induction in Object-oriented Databases
Knowledge discovery in databases is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data such that the extracted knowledge may facilitate deductive reasoning and query processing in database systems. This branch of study has been ranked among the most promising topics for database research for the 1990s. Due to the dominating influence of relat...
A Paradigm Shift in Database Optimization: from Indices to Aggregates
Multidimensional databases and online analytical processing (OLAP) tools provide new ways for decision-makers to access data and retrieve information. This paper examines the differences between the optimization techniques that database designers need to consider when developing relational versus multidimensional data warehouses. The multidimensional data storage model allows for large numbers ...
Knowledge Discovery in Spatial Databases
Both the number and the size of spatial databases, such as geographic or medical databases, are growing rapidly because of the large amount of data obtained from satellite images, computer tomography, or other scientific equipment. Knowledge discovery in databases (KDD) is the process of discovering valid, novel, and potentially useful patterns from large databases. Typical tasks for knowledge d...
Design and Implementation of a Scalable Parallel System for Multidimensional Analysis and OLAP
Multidimensional Analysis and On-Line Analytical Processing (OLAP) use summary information that requires aggregate operations along one or more dimensions of numerical data values. Query processing for these applications requires different views of data for decision support. The Data Cube operator provides multi-dimensional aggregates, used to calculate and store summary information on a number...